Spoken word recognition without a TRACE
Abstract
How do we map the rapid input of spoken language onto phonological and lexical representations over time? Attempts at psychologically tractable computational models of spoken word recognition tend either to ignore time or to transform the temporal input into a spatial representation. TRACE, a connectionist model with broad and deep coverage of speech perception and spoken word recognition phenomena, takes the latter approach, using exclusively time-specific units at every level of representation. TRACE reduplicates featural, phonemic, and lexical inputs at every time step in a large memory trace, with rich interconnections (excitatory forward and backward connections between levels and inhibitory links within levels). As the length of the memory trace is increased, or as the phoneme and lexical inventories of the model are increased to a realistic size, this reduplication of time-specific (that is, temporal-position-specific) units leads to a dramatic proliferation of units and connections, raising the question of whether a more efficient approach is possible. Our starting point is the observation that models of visual object recognition (including visual word recognition) have grappled with the problem of spatial invariance and arrived at solutions other than a fully reduplicative strategy like that of TRACE. This inspires a new model of spoken word recognition that combines time-specific phoneme representations similar to those in TRACE with higher-level representations based on string kernels: temporally independent (time-invariant) diphone and lexical units. This reduces the number of necessary units and connections by several orders of magnitude relative to TRACE. Critically, we compare the new model to TRACE on a set of key phenomena, demonstrating that the new model inherits much of the behavior of TRACE and that the drastic computational savings do not come at the cost of explanatory power.
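The contrast between time-specific and time-invariant coding can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, and it assumes that a string-kernel diphone code can be approximated by the set of ordered phoneme pairs in a word (each pair records that one phoneme precedes another, whether or not they are adjacent). Under that assumption, the same word heard at two different onsets activates different time-specific units but the same diphone units.

```python
from itertools import combinations


def time_specific_units(phonemes, onset):
    """TRACE-style coding (simplified): each phoneme is bound to an absolute time slot."""
    return {(p, onset + i) for i, p in enumerate(phonemes)}


def open_diphones(phonemes):
    """Assumed string-kernel code: every ordered pair (a, b) with a before b."""
    return set(combinations(phonemes, 2))


def diphones_from_slots(units):
    """Recover phoneme order from time-specific units, then discard absolute time."""
    ordered = [p for p, _ in sorted(units, key=lambda u: u[1])]
    return open_diphones(ordered)


cat = ["k", "a", "t"]
early = time_specific_units(cat, onset=0)
late = time_specific_units(cat, onset=5)

# Different onsets yield different time-specific units...
print(early == late)  # False
# ...but the same time-invariant diphone units.
print(diphones_from_slots(early) == diphones_from_slots(late))  # True
# Order is still encoded: "cat" and "tack" share phonemes but not diphones.
print(sorted(open_diphones(cat)))  # [('a', 't'), ('k', 'a'), ('k', 't')]
print(open_diphones(cat) & open_diphones(["t", "a", "k"]))  # set()
```

Because the diphone code does not depend on onset, a single set of diphone and word units can serve every position in the input, whereas a time-specific code needs a separate copy of each unit per time slot.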
Related articles
A time-invariant connectionist model of spoken word recognition
One of the largest remaining unsolved mysteries in cognitive science is how the rapid input of spoken language is mapped onto phonological and lexical representations over time. Attempts at psychologically-tractable computational models of spoken word recognition tend either to ignore time or to transform the temporal input into a spatial representation. This is the approach taken in TRACE (McC...
Simple Recurrent Networks and human spoken word recognition
A crucial problem in cognitive science, especially for speech processing, is sequence encoding. Models of spoken word recognition either ignore the problem (e.g., Norris et al., 2000), posit solutions incapable of representing repeated elements (e.g., Grossberg & Kazerounian, 2011), or "spatialize" time in possibly unrealistic ways (TRACE; McClelland & Elman, 1986). An alternative that has not ...
Simple Recurrent Networks and Competition Effects in Spoken Word Recognition
Continuous mapping models of spoken word recognition such as TRACE (McClelland and Elman, 1986) make robust predictions about a wide variety of phenomena. However, most of these models are interactive activation models with preset weights, and do not provide an account of learning. Simple recurrent networks (SRNs, e.g., Elman, 1990) are continuous mapping models that can process sequential patt...
The influence of the phonological neighborhood clustering coefficient on spoken word recognition.
Clustering coefficient, a measure derived from the new science of networks, refers to the proportion of phonological neighbors of a target word that are also neighbors of each other. Consider the words bat, hat, and can, all of which are neighbors of the word cat; the words bat and hat are also neighbors of each other. In a perceptual identification task, words with a low clustering coefficient (...
Within-category VOT affects recovery from "lexical" garden paths: Evidence against phoneme-level inhibition.
Spoken word recognition shows gradient sensitivity to within-category voice onset time (VOT), as predicted by several current models of spoken word recognition, including TRACE (McClelland & Elman, Cognitive Psychology, 1986). It remains unclear, however, whether this sensitivity is short-lived or whether it persists over multiple syllables. VOT continua were synthesized for pairs of words like...